Group 18: FINDING GENE PATTERNS IN BREAST CANCER DATA
Florencia De Lillo s242869
Nikolas Alexander Mumm s242825
Rodrigo Gallegos Dextre s243563
Introduction:
Question: Which genes are differentially expressed in different subtypes of cancer?
General workflow
![]()
General workflow
EXPLORATORY ANALYSIS AND TIDY:
![]()
Cleaning procedure
EXPLORATORY ANALYSIS AND TIDY:
![]()
Ratio between male and female
EXPLORATORY ANALYSIS AND TIDY:
![]()
Age of female patients stratified by cancer status
DESEQ Analysis:
![]()
DESEQ workflow
PCA Analysis:
The scree plot and the cumulative variance plot show how much of the variability in the data each principal component explains.
The high dimensionality required to explain 85% of the variability shows that no small set of components summarises the data, which makes cancer analysis a difficult task.
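As a minimal sketch of how the number of components behind an 85% threshold can be read off a scree plot: the snippet below turns per-component variances into explained and cumulative fractions and finds the first component at which the threshold is reached. The function names and the example variances are hypothetical and chosen for illustration only; they are not values from this analysis.

```python
from itertools import accumulate


def cumulative_variance(pc_variances):
    """Return (per-PC fraction, cumulative fraction) of the total variance."""
    total = sum(pc_variances)
    ratios = [v / total for v in pc_variances]
    return ratios, list(accumulate(ratios))


def components_needed(cumulative, threshold=0.85):
    """Smallest number of leading PCs whose cumulative fraction reaches threshold."""
    for k, fraction in enumerate(cumulative, start=1):
        if fraction >= threshold:
            return k
    return len(cumulative)


# Made-up variances for illustration only (assumption, not project data).
example_variances = [4.0, 2.0, 1.0, 1.0, 1.0, 0.5, 0.5]
ratios, cumulative = cumulative_variance(example_variances)
k = components_needed(cumulative, threshold=0.85)
```

The per-PC fractions correspond to the bars of a scree plot and the running sum to the cumulative curve. The more slowly that curve rises, the more components are needed to reach the threshold.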
PCA Analysis:
- The highlighted genes for each PC might be linked to specific biological pathways or processes, as they represent the main drivers of variance for the data.
- The PCA shows substantial overlap between “TUMOR FREE” and “WITH TUMOR”, so the two cancer statuses do not separate clearly.
Results:
- Using DESeq, we identified genes that are significantly up- or down-regulated between patient groups
- We used fgsea to find the 10 gene sets that differ most between groups:
- Angiogenesis is strongly up-regulated
- Immune response is strongly down-regulated
- Other interesting insights:
- The pathway for cellular response to hydrogen peroxide is up-regulated
- The most down-regulated pathway is sensory perception of taste
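The up/down classification of genes mentioned in the first result can be sketched as follows. This is only an illustration, assuming each gene comes with a log2 fold change and an adjusted p-value, as in a typical differential expression results table. The cutoffs (`alpha=0.05`, `min_abs_lfc=1.0`), the function name, and the gene names are assumptions, not the settings or results of this project.

```python
def classify_gene(log2_fold_change, padj, alpha=0.05, min_abs_lfc=1.0):
    """Label one gene as 'up', 'down' or 'not significant'.

    alpha and min_abs_lfc are assumed cutoffs, not the project's settings.
    A missing adjusted p-value (None) is treated as not significant.
    """
    if padj is None or padj >= alpha:
        return "not significant"
    if log2_fold_change >= min_abs_lfc:
        return "up"
    if log2_fold_change <= -min_abs_lfc:
        return "down"
    return "not significant"


# Hypothetical genes and values, for illustration only.
example_results = {
    "GENE_A": (2.3, 0.001),
    "GENE_B": (-1.7, 0.01),
    "GENE_C": (0.4, 0.0001),
    "GENE_D": (3.0, None),
}
labels = {
    gene: classify_gene(lfc, padj)
    for gene, (lfc, padj) in example_results.items()
}
```

Requiring a minimum effect size in addition to significance keeps genes with tiny but statistically detectable changes, such as `GENE_C` here, out of the up- and down-regulated lists.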
Discussion:
- Analysing breast cancer at the level of gene expression is a complex problem
- Many genes (and gene sets) are involved
- Computational tools show great potential for providing further insights